Do Grammars Minimize Dependency Length?

Authors

  • Daniel Gildea
  • David Temperley
Abstract

A well-established principle of language is that closely related words tend to be close together in the sentence. This preference can be expressed as dependency length minimization (DLM). In this study, we quantitatively explore the degree to which natural languages reflect DLM. We extract the dependencies from natural language text and reorder the words so as to minimize dependency length. Comparing the original text with these optimal linearizations (and also with random linearizations) reveals the degree to which natural language minimizes dependency length. Tests on English data show a strong effect of DLM, with dependency length much closer to optimal than to random; the grammar obtained by optimization also has many specific features in common with English. In German, too, dependency length is significantly shorter than random, but the effect is much weaker than in English. We conclude by speculating about some possible reasons for this difference between English and German.
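To make the measure concrete, here is a minimal sketch (my own illustration, not the authors' code) of how total dependency length can be computed for a toy sentence and compared against a random reordering of the same words; the word list and head assignments are invented for the example.

```python
import random

def dependency_length(order, heads):
    """order: words left to right; heads: dependent -> head (root omitted)."""
    pos = {w: i for i, w in enumerate(order)}
    return sum(abs(pos[w] - pos[h]) for w, h in heads.items())

# Toy sentence; "barked" is the root of the (invented) dependency tree.
order = ["the", "old", "dog", "barked", "loudly"]
heads = {"the": "dog", "old": "dog", "dog": "barked", "loudly": "barked"}

observed = dependency_length(order, heads)

# Random-linearization baseline: shuffle the words but keep the same tree.
random.seed(0)
shuffled = order[:]
random.shuffle(shuffled)
print("observed:", observed, "random:", dependency_length(shuffled, heads))
```

Averaging such comparisons over a corpus, with the optimal linearization added as a third condition, gives the kind of observed-versus-optimal-versus-random comparison the abstract describes.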

Similar papers

Optimizing Grammars for Minimum Dependency Length

We examine the problem of choosing word order for a set of dependency trees so as to minimize total dependency length. We present an algorithm for computing the optimal layout of a single tree as well as a numerical method for optimizing a grammar of orderings over a set of dependency types. A grammar generated by minimizing dependency length in unordered trees from the Penn Treebank is found t...
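As a rough illustration of the optimization target, the sketch below finds, by exhaustive search over a toy tree, the projective word order (one in which every subtree stays contiguous) that minimizes total dependency length. This stands in for the paper's efficient algorithm, which is not reproduced here; the example words and heads are invented.

```python
from itertools import permutations

def descendants(word, heads):
    """All words in the subtree rooted at `word`, including `word` itself."""
    out = {word}
    for w, h in heads.items():
        if h == word:
            out |= descendants(w, heads)
    return out

def dependency_length(order, heads):
    pos = {w: i for i, w in enumerate(order)}
    return sum(abs(pos[w] - pos[h]) for w, h in heads.items())

def is_projective(order, heads):
    """Projective here means every subtree occupies a contiguous span."""
    pos = {w: i for i, w in enumerate(order)}
    for w in order:
        span = [pos[x] for x in descendants(w, heads)]
        if max(span) - min(span) + 1 != len(span):
            return False
    return True

# Invented unordered tree (dependent -> head); "barked" is the root.
words = ["the", "big", "dog", "barked", "loudly", "yesterday"]
heads = {"the": "dog", "big": "dog", "dog": "barked",
         "loudly": "barked", "yesterday": "barked"}

best = min(
    (p for p in permutations(words) if is_projective(p, heads)),
    key=lambda p: dependency_length(p, heads),
)
print(best, dependency_length(best, heads))
```

Exhaustive search is exponential in sentence length, which is why an efficient layout algorithm of the kind described in the abstract matters for treebank-scale experiments.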

Minimal-length linearizations for mildly context-sensitive dependency trees

The extent to which the organization of natural language grammars reflects a drive to minimize dependency length remains little explored. We present the first algorithm that is polynomial-time in sentence length for obtaining the minimal-length linearization of a dependency tree subject to constraints of mild context-sensitivity. For the minimally context-sensitive case of gap-degree 1 dependency trees,...
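For readers unfamiliar with the term, the following sketch computes the gap degree of a dependency tree under a given word order, following my reading of the standard definition rather than the paper's code: the gap degree of a node is the number of discontinuities in the positions covered by its subtree, so projective trees have gap degree 0 and the gap-degree 1 case allows one discontinuity per subtree. The example sentence and head assignments are invented.

```python
def descendants(word, heads):
    """All words in the subtree rooted at `word`, including `word` itself."""
    out = {word}
    for w, h in heads.items():
        if h == word:
            out |= descendants(w, heads)
    return out

def gap_degree(order, heads):
    """Maximum number of gaps in any subtree's set of positions."""
    pos = {w: i for i, w in enumerate(order)}
    worst = 0
    for w in order:
        span = sorted(pos[x] for x in descendants(w, heads))
        gaps = sum(1 for a, b in zip(span, span[1:]) if b - a > 1)
        worst = max(worst, gaps)
    return worst

# Discontinuous example: the subtree of "hearing" ("a hearing on the issue")
# is interrupted by "is scheduled", giving gap degree 1.
order = ["a", "hearing", "is", "scheduled", "on", "the", "issue"]
heads = {"a": "hearing", "on": "hearing", "the": "issue",
         "issue": "on", "hearing": "scheduled", "is": "scheduled"}
print(gap_degree(order, heads))  # prints 1
```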

Categorial Dependency Grammars with Iterated Sequences

Some dependency treebanks use special sequences of dependencies where main arguments are mixed with separators. Classical Categorial Dependency Grammars (CDG) do not allow this construction because iterative dependency types only introduce the iterations of the same dependency. An extension of CDG is defined here that introduces a new construction for repeatable sequences of one or several depe...

Writing Weighted Constraints for Large Dependency Grammars

Implementing dependency grammar as a set of defeasible declarative rules has fundamental advantages such as expressiveness, automatic disambiguation, and robustness. Although an implementation and a successful large-scale grammar of German are available, so far the construction of constraint dependency grammars has not been described at length. We report on techniques that were used to write th...

Treebank Grammar Techniques for Non-Projective Dependency Parsing

An open problem in dependency parsing is the accurate and efficient treatment of non-projective structures. We propose to attack this problem using chart-parsing algorithms developed for mildly context-sensitive grammar formalisms. In this paper, we provide two key tools for this approach. First, we show how to reduce non-projective dependency parsing to parsing with Linear Context-Free Rewriting...
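As background for what "non-projective" means here, the sketch below checks projectivity arc by arc (my own illustration, not the paper's LCFRS reduction): an arc is projective when every word between the head and the dependent is dominated by that head, and a tree is projective iff all of its arcs are. The position-indexed examples are invented.

```python
def is_projective(heads):
    """heads: dependent position -> head position (0-based, root omitted)."""
    def dominated_by(h, w):
        # Walk up the chain of heads from w and see whether we reach h.
        while w in heads:
            w = heads[w]
            if w == h:
                return True
        return False

    for d, h in heads.items():
        for k in range(min(d, h) + 1, max(d, h)):
            if not dominated_by(h, k):
                return False
    return True

# Projective chain 0 -> 1 -> 2 <- 3 (word 2 is the root).
print(is_projective({0: 1, 1: 2, 3: 2}))   # True
# Arcs 0->2 and 1->3 cross, so this tree is non-projective.
print(is_projective({0: 2, 1: 3, 2: 3}))   # False
```

Trees rejected by this check are exactly the non-projective cases that the proposed reduction is designed to handle.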


Journal:
  • Cognitive Science

Volume: 34, Issue: 2

Pages: -

Publication date: 2010